Strictly speaking, Linux is only a kernel; it does not include shells, utilities, or any other software. A Linux OS is a bundle of the Linux kernel with other software that makes that operating system fit for purpose. This flexibility allows the Linux kernel to run systems both small and large, from smartphones to cars, farming equipment, and space vehicles. Bash is not the only available command-line shell; one can simply install other shells as desired. The same applies to graphical user interfaces: when a graphical interface is needed on a system, you simply install one. For this topic, we are focusing on the fundamentals rather than specific software. This will help you understand the terminology and GUI requirements rather than focusing on aesthetics.
For a more detailed explanation, see the Wikipedia Article on Windowing System.
The GUI stack has several components:
The Linux kernel provides the drivers needed to operate the hardware layer, including graphics drivers and drivers for input devices such as mice, keyboards, and touch screens. The display server is a software layer that provides the foundational means to handle input from input devices and to draw text and graphics to the output devices. For a long time, the most popular display server was the X Window System (X11), most commonly in the implementation from X.Org. Currently, many Linux distros are switching to Wayland-based display servers as the default, and some are phasing out their X sessions.
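Because a system may run either an X or a Wayland display server, scripts sometimes need to find out which one the current session uses. The following is a minimal Python sketch, assuming the session exports the conventional `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, and `DISPLAY` environment variables. The function name `detect_session_type` is made up for illustration, and the check is a heuristic rather than an authoritative test.

```python
import os


def detect_session_type(env=None):
    """Guess the display protocol of the current session.

    Returns "wayland", "x11", or "none" (for example on a text console).
    `env` defaults to the process environment; passing a dict makes the
    function easy to try out without a graphical session.
    """
    env = os.environ if env is None else env

    # Many login managers declare the session type explicitly.
    declared = env.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("wayland", "x11"):
        return declared

    # Fall back to the socket variables. Wayland is checked first because
    # Wayland sessions often also set DISPLAY for X compatibility (XWayland).
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "none"


if __name__ == "__main__":
    print(detect_session_type())
```

Passing a dictionary instead of reading the real environment, for example `detect_session_type({"DISPLAY": ":0"})`, lets you experiment with the logic on a machine that has no GUI at all.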
Examples of display servers include:
Display servers communicate with "client" software through a communication protocol. The idea is that any given display server is only one implementation of the protocol, which can be implemented by anyone on any system. The protocol abstracts away the low-level code and libraries and allows the "clients" to talk to any underlying display server that speaks it. Additionally, depending on the protocol, communication can take place over different channels, including over a network. The ability of a display server protocol to function over a network is referred to as network transparency. Network transparency is desirable for terminal servers that provide direct GUI access to remote clients. The alternative is a remote desktop protocol based on pixel scraping, such as VNC.
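To make network transparency a little more concrete, consider how an X11 client decides where its display server is. By convention, the `DISPLAY` variable has the form `host:display.screen`; an empty host means a local Unix socket under `/tmp/.X11-unix`, and a remote host is reached on TCP port 6000 plus the display number. The sketch below parses that string. The names `XDisplay`, `parse_display`, and `transport` are hypothetical, and less common forms (such as IPv6 addresses) are deliberately not handled.

```python
from typing import NamedTuple


class XDisplay(NamedTuple):
    host: str     # empty string means "this machine"
    display: int  # display (server) number
    screen: int   # screen number, 0 when omitted


def parse_display(value: str) -> XDisplay:
    """Parse an X11 DISPLAY string such as ":0" or "localhost:10.0"."""
    host, sep, rest = value.rpartition(":")
    if not sep or not rest:
        raise ValueError(f"not a DISPLAY string: {value!r}")
    display_str, _, screen_str = rest.partition(".")
    return XDisplay(host, int(display_str), int(screen_str or 0))


def transport(d: XDisplay) -> str:
    """Describe the channel an X client would conventionally use."""
    if d.host in ("", "unix"):
        # Local connection through a Unix domain socket.
        return f"unix:/tmp/.X11-unix/X{d.display}"
    # Remote (or forwarded) connection over TCP.
    return f"tcp:{d.host}:{6000 + d.display}"


if __name__ == "__main__":
    for example in (":0", "localhost:10.0", "remotehost:2.1"):
        print(example, "->", transport(parse_display(example)))
```

The same client binary works in both cases; only the value of `DISPLAY` changes, which is the essence of network transparency.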
Here are some examples of display server protocols:
The window manager is in charge of controlling how windows are placed on the display. It must be compatible with the display server and the display server protocol. A window is the box that surrounds an application, and its borders may or may not be visible to the user. For example, if you have played PC games, you might have seen the full-screen, windowed, and "full-screen windowed" options for launching a game. The window manager also provides the window decorations: the visible frame drawn around a window, which may include the title, minimize/maximize buttons, resizing handles, and so on.
The image below is a screenshot of the Window Manager choices that were available in the old Cinnamon desktop environment under X. One could simply pick and choose their desired Window Manager:
X.Org implements the window manager as a separate component, so users can install different window managers as long as they are X11-compliant. Most PC users are used to, and expect, a stacking (floating) window manager like the ones in Windows and macOS, where windows can be moved and placed anywhere on the screen and can overlap each other. There are also tiling window managers, which place windows side by side without allowing overlap. Tiling is often used on touch-screen systems (industrial and automotive applications) and is favoured by power users who prefer keyboard shortcut navigation to using the mouse. In addition, a tiling-floating hybrid desktop may be useful to some users on large, high-resolution displays. If you are curious about what tiling window managers look like, here are two examples: GlazeWM for Windows and Sway for Linux under Wayland.
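The core idea of a tiling window manager can be sketched in a few lines: given the screen size and the number of windows, compute non-overlapping rectangles that together cover the screen. The Python sketch below uses a simple "master and stack" layout (one large window on the left, the rest stacked on the right). This is an illustrative assumption, not the algorithm of any particular window manager, and the names `Rect` and `tile` are hypothetical.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def tile(screen: Rect, n: int, master_ratio: float = 0.5) -> List[Rect]:
    """Return n non-overlapping rectangles that exactly cover `screen`."""
    if n <= 0:
        return []
    if n == 1:
        return [screen]  # a single window gets the whole screen

    # The first window takes the left part of the screen.
    master_w = int(screen.w * master_ratio)
    tiles = [Rect(screen.x, screen.y, master_w, screen.h)]

    # Remaining windows share the right part, stacked vertically.
    stack_n = n - 1
    stack_w = screen.w - master_w
    base_h, extra = divmod(screen.h, stack_n)
    y = screen.y
    for i in range(stack_n):
        # Spread leftover pixels over the first `extra` windows.
        h = base_h + (1 if i < extra else 0)
        tiles.append(Rect(screen.x + master_w, y, stack_w, h))
        y += h
    return tiles


if __name__ == "__main__":
    for rect in tile(Rect(0, 0, 1920, 1080), 3):
        print(rect)
```

A real tiling window manager recomputes such a layout whenever a window opens, closes, or is moved with a keyboard shortcut, which is why no manual dragging or resizing is needed.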
Note: Some terminal applications, such as tmux, can imitate tiling behaviour in a single terminal using character graphics. This is extremely useful for system administrators who are performing long-running parallel tasks via SSH and would like the processes and the session to persist, even in the event of loss of network connection.
Please note that Wayland does not require an external window manager. The display server acts as both the compositor and the window manager (the compositor is the component that enables visual effects such as transparency, shadows, and shading). In the block diagram below, you can see the simplified architecture compared to X, where the widget toolkits of the various desktop environments use "glue code" to send graphics directly to the userspace graphics libraries (in this case, Mesa 3D). The kernel hosts the Direct Rendering Manager (DRM), where the low-level graphics drivers and firmware reside. (For more detailed information, see the Wikipedia Article on Wayland Protocol)
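As a rough illustration of what "compositing" means, the sketch below blends one pixel of a semi-transparent window over the pixel behind it, using the standard alpha "over" formula for an opaque background. Real compositors do this on the GPU for whole surfaces; the function name `blend_over` and the per-pixel tuple representation are assumptions made purely for illustration.

```python
def blend_over(src, dst, alpha):
    """Blend an RGB pixel `src` over an opaque RGB pixel `dst`.

    `alpha` is the opacity of `src`, from 0.0 (invisible) to 1.0 (opaque).
    Each channel is computed as alpha * src + (1 - alpha) * dst.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return tuple(round(alpha * s + (1.0 - alpha) * d) for s, d in zip(src, dst))


if __name__ == "__main__":
    window_pixel = (200, 100, 0)
    background_pixel = (0, 100, 200)
    print(blend_over(window_pixel, background_pixel, 0.5))
```

With an opacity of 1.0 the window pixel fully hides the background, and with 0.0 only the background remains; values in between produce the translucent effect.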
Widget toolkits are software libraries that provide the GUI components for desktop environments. Software developers use these libraries to build the look and functionality of their applications from existing elements such as buttons, tabs, menus, dialogs, icons, graphics, and basically everything else that can exist within a GUI application's window. While many more widget toolkits exist for various platforms, GTK (formerly GTK+) and Qt (officially pronounced "cute") are currently the most common ones used for Linux GUI applications. Both are cross-platform, so applications using these libraries can be ported between Windows, macOS, and Linux. Additionally, OS vendors and creators of desktop environments provide application development tools to assist developers with the visual design of their GUI applications. Examples of such tools include Glade from GNOME and Qt Designer, part of the Qt Widgets suite of tools by the Qt Group (a publicly traded company based in Finland).
Since a Linux system can have multiple desktop environments and session types installed (which can be configured per user), there needs to be a way to select and start the correct display server and desktop environment at login time. The display manager is the first graphical interface that gets loaded after system startup. Usually, a display manager provides a login screen where the user can enter their credentials and select their session type. The display manager then starts the user session with the selected display server and desktop environment. When you install a graphical desktop environment for the first time, you may be asked to install and configure an appropriate display manager.
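On many distributions, the sessions offered on the login screen are described by `.desktop` files, conventionally placed in `/usr/share/xsessions` for X sessions and `/usr/share/wayland-sessions` for Wayland sessions. The following Python sketch reads such files and lists each session's name and start command. It assumes those two directories (your distribution may use others), and the function name `list_sessions` is hypothetical.

```python
import configparser
from pathlib import Path

# Conventional locations; treat these as assumptions for your system.
SESSION_DIRS = {
    "x11": Path("/usr/share/xsessions"),
    "wayland": Path("/usr/share/wayland-sessions"),
}


def list_sessions(session_dirs=SESSION_DIRS):
    """Return a list of dicts describing the installed session entries."""
    sessions = []
    for session_type, directory in session_dirs.items():
        if not directory.is_dir():
            continue  # this session type is not installed
        for entry in sorted(directory.glob("*.desktop")):
            parser = configparser.ConfigParser(
                interpolation=None, strict=False, delimiters=("=",)
            )
            parser.optionxform = str  # keys in .desktop files are case-sensitive
            try:
                parser.read(entry, encoding="utf-8")
            except configparser.Error:
                continue  # skip files that are not valid INI-style text
            if not parser.has_section("Desktop Entry"):
                continue
            section = parser["Desktop Entry"]
            sessions.append({
                "type": session_type,
                "name": section.get("Name", entry.stem),
                "exec": section.get("Exec", ""),
            })
    return sessions


if __name__ == "__main__":
    for session in list_sessions():
        print(f'{session["type"]:8} {session["name"]} -> {session["exec"]}')
```

The `Exec` value is the command the display manager would run to start the chosen session after a successful login.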
Some of the common display managers are:
A desktop environment is a collection of software and libraries that provides a standard look, feel, and behaviour for the user interface. A desktop environment may include software such as a file manager, a shell with docks, menus, and panels, and other utilities expected to exist in a desktop environment, such as date and time, a calculator, etc. There are many desktop environments to choose from. A few popular desktop environments include:
In addition to the traditional desktop form factor, other form factors run Linux and may have different requirements. The principles are largely the same, and much of the underlying code base is reused or adapted for the intended purpose of the target system. For example, here are some desktop environments that specifically target smartphones and tablets:
The GUI software stacks inherited from servers and PCs may not be suitable for all types of applications. Cars, IoT devices, and industrial control appliances are not desktop PCs. However, it is interesting to observe that the same software components and designs can be reused and adapted for a wide range of applications. As an example, various automakers use the Linux kernel in their cars. The Linux kernel powers both Android Automotive and AGL (Automotive Grade Linux). While Android Automotive is controlled by Google, AGL is a collaborative Linux Foundation project. AGL covers a lot of ground for "software-defined vehicles", ranging from controlling the powertrain to safety and self-driving functionality. For graphical applications such as gauges and infotainment systems, Flutter and Qt are currently being adopted (the same toolkits used to make modern apps for your desktop).
AGL-based Starlink IVI system in 2020 Subaru Legacy:
Toyota's Flutter UI contribution to AGL as a reference implementation. Source: https://www.automotivelinux.org/announcements/quirkyquillback/